-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature/semantic-text] semantic text copy to support #106689
[feature/semantic-text] semantic text copy to support #106689
Conversation
…erver (elastic#105012)" This reverts commit f4d3ab9.
…copy-to-support-inference # Conflicts: # server/src/main/java/org/elasticsearch/action/bulk/BulkShardRequestInferenceProvider.java # server/src/test/java/org/elasticsearch/action/bulk/BulkOperationTests.java
…ject mappers can provide underlying field mappers
…s and object mappers can provide underlying field mappers" This reverts commit 05aa06f.
…copy-to-support-inference # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/InferenceResultFieldMapper.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextModelSettings.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilterTests.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/mapper/InferenceResultFieldMapperTests.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapperTests.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/10_semantic_text_inference.yml # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/20_semantic_text_field_mapper.yml
I'm dropping support for multifields after our conversations, so this PR is now adding just @jimczi , @Mikep86 - should we double check that all source fields are provided when doing an update, so we can redo inference on them? I would say no for the moment. I can think of having source fields that are potentially null, and thus users would not have an easy way out of it. We could include docs that include the limitations for |
I agree. Partial update handling has some more moving parts, such as attributing each chunk to the source that generated it, that are best left to a separate PR. |
@@ -1030,7 +1030,7 @@ public final void testMinimalIsInvalidInRoutingPath() throws IOException { | |||
} | |||
} | |||
|
|||
private String minimalIsInvalidRoutingPathErrorMessage(Mapper mapper) { | |||
protected String minimalIsInvalidRoutingPathErrorMessage(Mapper mapper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to take back this to protected so it can be overriden by our tests
item.id(), | ||
new FieldInferenceResponseAccumulator( | ||
String fieldName = entry.getKey(); | ||
for (var sourceField : entry.getValue().sourceFields()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we iterate on all the source fields for retrieving inference - so we're adding an additional loop here
Map<String, Object> fieldMap = new LinkedHashMap<>(); | ||
fieldMap.put(INFERENCE_ID, model.getInferenceEntityId()); | ||
|
||
Map<String, Object> fieldMap = (Map<String, Object>) inferenceMap.computeIfAbsent(field, s -> new LinkedHashMap<>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple inference results can be applied to a single field from each source field - so we need to be prepared for that
@@ -310,3 +310,38 @@ setup: | |||
id: doc_1 | |||
body: | |||
non_inference_field: "non inference test" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other tests come to mind?
@elasticmachine run elasticsearch-ci/part-2 |
Pinging @elastic/es-search (Team:Search) |
…copy-to-support-inference # Conflicts: # server/src/test/java/org/elasticsearch/index/mapper/FieldTypeLookupTests.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/cluster/metadata/SemanticTextClusterMetadataTests.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilterTests.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/10_semantic_text_inference.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, just one question I sign off on it
...ain/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java
Outdated
Show resolved
Hide resolved
...ain/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java
Show resolved
Hide resolved
var value = XContentMapValues.extractValue(sourceField, docMap); | ||
if (value == null) { | ||
if (isUpdateRequest) { | ||
addInferenceResponseFailure( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this check for disallowing updates that do not use all source fields. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, except for one edge case: Say that a user wants to use an update request to delete a field from a document. Traditionally, they would do this by explicitly setting the field value to null
. How do we tell the difference between the null
we get when the user didn't set a value and the null
we get when the user explicitly set it?
IMO since we have a way to support partial updates in the near future (chunk source attribution), we could just live with this edge case for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
var value = XContentMapValues.extractValue(sourceField, docMap); | ||
if (value == null) { | ||
if (isUpdateRequest) { | ||
addInferenceResponseFailure( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, except for one edge case: Say that a user wants to use an update request to delete a field from a document. Traditionally, they would do this by explicitly setting the field value to null
. How do we tell the difference between the null
we get when the user didn't set a value and the null
we get when the user explicitly set it?
IMO since we have a way to support partial updates in the near future (chunk source attribution), we could just live with this edge case for now.
Merging this after checking with Jim and Mike. We can change the way we deal with missing source fields later. |
Adds
copy_to
support tosemantic_text
.Changes: